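End-to-end learning models have demonstrated a remarkable capability in performing speech segregation. Despite their wide range of real-world applications, little is known about the mechanisms they employ to group and consequently segregate individual speakers. Knowing that harmonicity is a critical cue for these networks to group sources, in this work we perform a thorough investigation of Conv-TasNet and DPT-Net to analyze how they perform a harmonic analysis of the input mixture. We perform ablation studies in which we apply low-pass, high-pass, and band-stop filters of varying pass-bands to empirically analyze the harmonics most critical for segregation. We also investigate how these networks decide which output channel to assign to an estimated source by introducing discontinuities in synthetic mixtures. We find that end-to-end networks are highly unstable and perform poorly when confronted with deformations that are imperceptible to humans. Replacing the encoder in these networks with a spectrogram leads to lower overall performance but much higher stability. This work helps us understand what information these networks rely on for speech segregation, and exposes two sources of generalization error. It also pinpoints the encoder as the part of the network responsible for these errors, allowing for a redesign with expert knowledge or transfer learning.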
translated by 谷歌翻译
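The filtering ablation described in the first abstract can be illustrated with a small sketch. It is not the authors' pipeline: the brick-wall DFT filter, the 8 kHz sample rate, the 200 Hz fundamental, and all function names are assumptions chosen only to show how removing a frequency band deletes specific harmonics from a harmonic tone.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_stop(x, sample_rate, low_hz, high_hz):
    """Zero every DFT bin whose frequency lies in [low_hz, high_hz]."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            spectrum[k] = 0
    return idft(spectrum)


def harmonic_tone(f0, num_harmonics, sample_rate, n):
    """Sum of unit-amplitude sinusoids at f0, 2*f0, ..., num_harmonics*f0."""
    return [sum(math.sin(2 * math.pi * f0 * h * t / sample_rate)
                for h in range(1, num_harmonics + 1))
            for t in range(n)]


def magnitude_at(x, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (must fall on a bin)."""
    n = len(x)
    k = round(freq_hz * n / sample_rate)
    coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return abs(coeff) / (n / 2)


# Hypothetical test signal: 200 Hz fundamental with four harmonics.
tone = harmonic_tone(f0=200, num_harmonics=4, sample_rate=8000, n=400)
# Remove the band around the second harmonic (400 Hz).
filtered = band_stop(tone, sample_rate=8000, low_hz=300, high_hz=500)
```

In this sketch a low-pass filter is the special case `band_stop(x, sr, cutoff, sr / 2)` and a high-pass filter is `band_stop(x, sr, 0, cutoff)`. A real study would more likely use smooth filters from a signal-processing library.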
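The multi-resolution spectro-temporal features of a speech signal represent how the brain perceives sound by tuning cortical cells to different spectral and temporal modulations. These features produce a higher-dimensional representation of the speech signal. The purpose of this paper is to evaluate how well the auditory cortex representation of speech signals contributes to estimating the articulatory features of the corresponding signals. Since obtaining articulatory features from the acoustic features of speech signals has long been a topic of interest for different speech communities, we investigate the possibility of using this multi-resolution representation of speech signals as acoustic features. We used the University of Wisconsin X-ray Microbeam (XRMB) database of clean speech signals to train a feed-forward deep neural network (DNN) to estimate the articulatory trajectories of six tract variables. The optimal set of multi-resolution spectro-temporal features for training was chosen using appropriate scale and rate vector parameters to obtain the best-performing model. Experiments achieved a correlation of 0.675 with the ground-truth tract variables. We compared the performance of this speech inversion system with prior experiments conducted using Mel-frequency cepstral coefficients (MFCCs).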
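A correlation score such as the 0.675 reported above is commonly computed as a Pearson correlation between estimated and ground-truth trajectories. The sketch below shows that computation under two assumptions not stated in the abstract: that Pearson correlation is the metric, and that per-variable scores are averaged across tract variables. The function names are illustrative.

```python
import math


def pearson(estimated, reference):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    norm_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated))
    norm_r = math.sqrt(sum((r - mean_r) ** 2 for r in reference))
    return cov / (norm_e * norm_r)


def mean_correlation(estimated_tracks, reference_tracks):
    """Average the per-variable correlation over all tract variables (assumed)."""
    scores = [pearson(e, r) for e, r in zip(estimated_tracks, reference_tracks)]
    return sum(scores) / len(scores)
```

For example, `pearson([1, 2, 3], [2, 4, 6])` is 1 up to floating-point rounding, because the second trajectory is an exact linear rescaling of the first.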
Light guide plates are essential optical components widely used in applications ranging from medical lighting fixtures to back-lit TV displays. In this work, we introduce a fully-integrated, high-throughput, high-performance deep learning-driven workflow for light guide plate surface visual quality inspection (VQI) tailored for real-world manufacturing environments. To enable automated VQI at the edge within the fully-integrated VQI system, a highly compact deep anti-aliased attention condenser neural network (which we name LightDefectNet) tailored specifically for light guide plate surface defect detection in resource-constrained scenarios was created via machine-driven design exploration with computational and "best-practices" constraints as well as an L_1 paired classification discrepancy loss. Experiments show that LightDefectNet achieves a detection accuracy of ~98.2% on the LGPSDD benchmark while having just 770K parameters (~33X and ~6.9X fewer than ResNet-50 and EfficientNet-B0, respectively), requiring ~93M FLOPs (~88X and ~8.4X fewer than ResNet-50 and EfficientNet-B0, respectively), and running inference ~8.8X faster than EfficientNet-B0 on an embedded ARM processor. As such, the proposed deep learning-driven workflow, integrated with the LightDefectNet neural network, is highly suited for high-throughput, high-performance light guide plate surface VQI within real-world manufacturing environments.
As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles, and so we refer to the method as 'Constitutional AI'. The process involves both a supervised learning and a reinforcement learning phase. In the supervised phase we sample from an initial model, then generate self-critiques and revisions, and then finetune the original model on revised responses. In the RL phase, we sample from the finetuned model, use a model to evaluate which of the two samples is better, and then train a preference model from this dataset of AI preferences. We then train with RL using the preference model as the reward signal, i.e. we use 'RL from AI Feedback' (RLAIF). As a result we are able to train a harmless but non-evasive AI assistant that engages with harmful queries by explaining its objections to them. Both the SL and RL methods can leverage chain-of-thought style reasoning to improve the human-judged performance and transparency of AI decision making. These methods make it possible to control AI behavior more precisely and with far fewer human labels.
Tumor segmentation in histopathology images is often complicated by its composition of different histological subtypes and class imbalance. Oversampling subtypes with low prevalence features is not a satisfactory solution since it eventually leads to overfitting. We propose to create synthetic images with semantically-conditioned deep generative networks and to combine subtype-balanced synthetic images with the original dataset to achieve better segmentation performance. We show the suitability of Generative Adversarial Networks (GANs) and especially diffusion models to create realistic images based on subtype-conditioning for the use case of HER2-stained histopathology. Additionally, we show the capability of diffusion models to conditionally inpaint HER2 tumor areas with modified subtypes. Combining the original dataset with the same amount of diffusion-generated images increased the tumor Dice score from 0.833 to 0.854 and almost halved the variance between the HER2 subtype recalls. These results create the basis for more reliable automatic HER2 analysis with lower performance variance between individual HER2 subtypes.
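For reference, the Dice score quoted above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. The sketch below computes it for flat binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions, not details from the abstract.

```python
def dice_score(predicted, reference):
    """Dice coefficient for two equal-length binary masks (sequences of 0/1)."""
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total
```

With `predicted = [1, 1, 0, 0]` and `reference = [1, 0, 0, 0]`, the masks share one foreground pixel out of three foreground pixels in total, so the score is 2·1/3.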
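Neural networks often pack many unrelated concepts into a single neuron - a puzzling phenomenon known as "polysemanticity" that makes interpretability much more challenging. This paper provides a toy model in which polysemanticity can be fully understood, arising as a result of models storing additional sparse features in "superposition". We demonstrate the existence of a phase change, a surprising connection to the geometry of uniform polytopes, and evidence of a link to adversarial examples. We also discuss potential implications for mechanistic interpretability.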
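The use of deep learning (DL) algorithms has improved the performance of vision-based space applications in recent years. However, generating large amounts of annotated data for training these DL algorithms has proven challenging. While synthetically generated images can be used, DL models trained on synthetic data are often susceptible to performance degradation when tested in real-world environments. In this context, the Interdisciplinary Centre for Security, Reliability and Trust (SnT) at the University of Luxembourg has developed the "SnT Zero-G Lab" for training and validating vision-based space algorithms in conditions emulating real-world space environments. An important aspect of the SnT Zero-G Lab development was equipment selection. Drawing on the lessons learned during the lab development, this article presents a systematic approach that combines a market survey with experimental analyses for equipment selection. In particular, the article focuses on the image acquisition equipment in a space lab: background materials, cameras, and illumination lamps. The results of the experimental analyses show that effective equipment selection in a space lab development project requires a market survey complemented by experimental analyses.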
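Human safety has always been the top priority when working near industrial robots. With the rise of human-robot collaborative environments, the physical barriers that prevent collisions have been disappearing, increasing both the risk of accidents and the need for solutions that ensure safe human-robot collaboration. This paper proposes a safety system that implements the Speed and Separation Monitoring (SSM) type of operation. To this end, safety zones are defined in the robot's workspace following current standards for industrial collaborative robots. A deep learning-based computer vision system detects, tracks, and estimates the 3D position of operators close to the robot. The robot control system receives the operators' 3D positions and generates 3D representations of them in a simulation environment. Depending on the zone in which the closest operator is detected, the robot stops or changes its operating speed. Three different operation modes in which the human and the robot interact are presented. Results show that the vision-based system can correctly detect and classify the safety zone in which an operator is located, and that the proposed operation modes ensure that the robot's reaction and stopping times stay within the limits required to guarantee safety.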
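The zone-based speed decision described above can be sketched as follows. The zone names, radii, and speed factors are hypothetical placeholders. The abstract does not give them, and real values would come from the applicable standards and a risk assessment.

```python
import math

# Hypothetical zone radii in metres (not taken from the paper).
DANGER_RADIUS_M = 1.0
WARNING_RADIUS_M = 2.5

# Hypothetical speed scaling applied in each zone.
SPEED_SCALE = {"danger": 0.0, "warning": 0.5, "safe": 1.0}


def classify_zone(distance_m):
    """Map an operator's distance from the robot to a safety zone."""
    if distance_m <= DANGER_RADIUS_M:
        return "danger"
    if distance_m <= WARNING_RADIUS_M:
        return "warning"
    return "safe"


def speed_for_operators(robot_xyz, operator_positions):
    """Return the speed factor dictated by the closest detected operator."""
    if not operator_positions:
        # No operator detected: run at full speed.
        return SPEED_SCALE["safe"]
    closest = min(math.dist(robot_xyz, p) for p in operator_positions)
    return SPEED_SCALE[classify_zone(closest)]
```

Here the closest operator alone determines the response, mirroring the abstract's description. A production system would add hysteresis and account for operator velocity, which this sketch omits.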
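Extraterrestrial rovers with a general-purpose robotic arm have many potential applications in lunar and planetary exploration. Introducing autonomy into such systems is desirable for increasing the time that rovers can spend gathering scientific data and collecting samples. This work investigates the applicability of deep reinforcement learning to vision-based robotic grasping of objects on the Moon. A novel simulation environment with procedurally generated datasets is created to train agents in unstructured scenes with uneven terrain and harsh illumination. A model-free off-policy actor-critic algorithm is then employed for end-to-end learning of a policy that maps compact octree observations directly to continuous actions in Cartesian space. Experimental evaluation indicates that 3D data representations enable more effective learning of manipulation skills than the traditionally used image-based observations. Domain randomization improves the generalization of learned policies to novel scenes with previously unseen objects and different illumination conditions. To this end, we demonstrate zero-shot sim-to-real transfer by evaluating trained agents on a real robot in a Moon-analogue facility.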
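Social media posts contain potentially valuable information about medical conditions and health-related behavior. BioCreative VII Task 3 focuses on mining this information by recognizing mentions of medications and dietary supplements in tweets. We approach this task by fine-tuning multiple BERT-style language models to perform token-level classification and combining them into ensembles to generate the final predictions. Our best system consists of five Megatron-BERT-345M models and achieves a strict F1 score of 0.764 on unseen test data.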
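The abstract does not say how the individual models are combined, so the sketch below assumes simple per-token majority voting, one common way to ensemble token-level classifiers. It also shows a strict (exact-span-match) F1 computation. All function names and the label strings are illustrative.

```python
from collections import Counter


def majority_vote(model_predictions):
    """Combine per-model label sequences by taking the most common label per token.

    model_predictions: list of label sequences, one per model, all equal length.
    Ties go to the label that appears first among the models (Counter ordering).
    """
    combined = []
    for token_labels in zip(*model_predictions):
        label, _count = Counter(token_labels).most_common(1)[0]
        combined.append(label)
    return combined


def strict_f1(predicted_spans, gold_spans):
    """F1 where a predicted span counts only if it matches a gold span exactly.

    Spans are (start, end) tuples; both arguments are sets of spans.
    """
    true_positives = len(predicted_spans & gold_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)
```

Other combination schemes, such as averaging the models' output probabilities before decoding, would fit the abstract's description equally well.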